The present invention is related to that disclosed in United States Provisional
Patent Application No. 60/108,939, filed on November 18, 1998, entitled "SCALABLE
VIDEO STREAMING USING MPEG-4", which is commonly assigned to the assignee of the 
present invention. The disclosure of this related provisional patent application is incorporated 
herein by reference for all purposes as if fully set forth herein. 

TECHNICAL FIELD OF THE INVENTION 

The present invention is directed, in general, to video processing systems and, 
more specifically, to a decoder buffer for use in a streaming video receiver. 



BACKGROUND OF THE INVENTION 

Real-time streaming of multimedia content over Internet protocol (IP) networks
has become an increasingly common application in recent years. A wide range of interactive
and non-interactive multimedia Internet applications, such as news on-demand, live TV
viewing, video conferencing, and many others, rely on end-to-end streaming solutions. Unlike
a "downloaded" video file, which may be retrieved first in "non-real" time and viewed or
played back later, streaming video applications require a video source to encode and to
transmit a video signal over a network to a video receiver, which must decode and display the
video signal in real time. The receiver relies on a decoder buffer to receive encoded video data
packets from the network and to transfer the packets to a video decoder.

Two problems arise when a streaming video signal is transmitted across a
non-guaranteed Quality-of-Service (QoS) network, such as the Internet. First, end-to-end variations
in the network (e.g., delay jitter) between the streaming video transmitter and the streaming
video receiver mean that the end-to-end delay is not constant. Second, there is usually a
significant packet loss rate across non-QoS networks, often requiring re-transmission. The lost
data packet must be recovered prior to the time the corresponding frame must be decoded. If
not, an underflow event occurs. Furthermore, if prediction-based compression is used, an
underflow due to lost data packets may not only impact the current frame being processed, but
may affect many subsequent frames.

It is well-known that re-transmission of lost packets is a viable means of
recovery for continuous media communication over packet networks. Many applications use a
negative automatic repeat request (NACK) in conjunction with re-transmission of the lost
packet. These approaches take into consideration both the round-trip delay and the delay jitter
between the sender and the receiver(s).

For example, an end-to-end model with re-transmission for packet voice
transmission has been developed. This model takes advantage of the fact that voice data
consists of periods of silence separated by brief talk-spurt segments. The model also assumes
that each talk-spurt consists of a fixed number of fixed-size packets. However, this model is
not general enough to capture the characteristics of compressed video (which can have a
variable number of bytes or packets per video frame).

There is therefore a need in the art for improved streaming video receivers that
compensate for variations inherent in a non-QoS network. In particular, there is a need for an
improved receiver decoder buffer that takes into consideration both transport delay parameters
(e.g., end-to-end delay and delay jitter) and video encoder buffer constraints. More
particularly, there is a need for an improved decoder buffer that eliminates the separation
between the network transport buffer, which is typically used to remove delay jitter and to
recover lost data, and the video decoder buffer.

SUMMARY OF THE INVENTION

The present invention is embodied in an Integrated Transport Decoder (ITD)
buffer model. One key advantage of the ITD model is that it eliminates the separation of a
network-transport buffer, which is typically used for removing delay jitter and recovering lost
data, from the video decoder buffer. This can significantly reduce the end-to-end delay and
optimize the usage of receiver resources (such as memory).

It is a primary object of the present invention to provide, for use with a video
decoder capable of decoding streaming video, a decoder buffer capable of receiving from a
streaming video transmitter data packets comprising the streaming video and storing the data
packets in a plurality of access units. Each of the access units is capable of holding at least one
data packet associated with a selected frame in the streaming video. The decoder buffer
comprises: 1) a first buffer region comprising at least one access unit capable of storing data
packets that are less immediately needed by the video decoder, and 2) a re-transmission region
comprising at least one access unit capable of storing data packets that are most immediately
needed by the video decoder, wherein the decoder buffer, in response to a detection of a
missing data packet in the re-transmission region, requests that the streaming video transmitter
retransmit the missing packet.

In one embodiment of the present invention, at least one of the data packets is
stored in the first buffer region for a period of time equal to a start-up delay time of the
decoder buffer.

In another embodiment of the present invention, the data packets are first stored 
in the first buffer region and are shifted into the re-transmission region. 

In still another embodiment of the present invention, the first buffer region is
separate from the re-transmission region.

In yet another embodiment of the present invention, the first buffer region
overlaps at least a portion of the re-transmission region.

In a further embodiment of the present invention, the first buffer region 
overlaps all of the re-transmission region. 

In a further embodiment of the present invention, the first buffer region is
separated from the re-transmission region by a second buffer region in which a late data packet
is late with respect to an expected time of arrival of the late data packet, but is not sufficiently
late to require a re-transmission of the late data packet.

The foregoing has outlined rather broadly the features and technical advantages
of the present invention so that those skilled in the art may better understand the detailed
description of the invention that follows. Additional features and advantages of the invention
will be described hereinafter that form the subject of the claims of the invention. Those skilled
in the art should appreciate that they may readily use the conception and the specific
embodiment disclosed as a basis for modifying or designing other structures for carrying out
the same purposes of the present invention. Those skilled in the art should also realize that
such equivalent constructions do not depart from the spirit and scope of the invention in its
broadest form.

Before undertaking the DETAILED DESCRIPTION, it may be advantageous to
set forth definitions of certain words and phrases used throughout this patent document: the
terms "include" and "comprise," as well as derivatives thereof, mean inclusion without
limitation; the term "or," is inclusive, meaning and/or; the phrases "associated with" and
"associated therewith," as well as derivatives thereof, may mean to include, be included
within, interconnect with, contain, be contained within, connect to or with, couple to or with,
be communicable with, cooperate with, interleave, juxtapose, be proximate to, be bound to or
with, have, have a property of, or the like; and the term "controller" means any device, system
or part thereof that controls at least one operation; such a device may be implemented in
hardware, firmware or software, or some combination of at least two of the same. It should be
noted that the functionality associated with any particular controller may be centralized or
distributed, whether locally or remotely. Definitions for certain words and phrases are
provided throughout this patent document; those of ordinary skill in the art should understand
that in many, if not most instances, such definitions apply to prior, as well as future uses of
such defined words and phrases.

BRIEF DESCRIPTION OF THE DRAWINGS 

For a more complete understanding of the present invention, and the advantages 
thereof, reference is now made to the following descriptions taken in conjunction with the 
accompanying drawings, wherein like numbers designate like objects, and in which: 

FIGURE 1 illustrates an end-to-end transmission of streaming video from a
streaming video transmitter through a data network to an exemplary streaming video receiver
according to one embodiment of the present invention;

FIGURE 2 illustrates an ideal encoder-decoder model of a video coding system; 

FIGURE 3 illustrates end-to-end transmission of streaming video from a 
compressed video source through a channel to an exemplary integrated transport decoder 
buffer and video decoder, without support for re-transmission, according to one embodiment 
of the present invention. 

FIGURE 4 illustrates a sequence diagram showing the flow of data packets
through different and distinct regions of an exemplary ideal integrated transport decoder buffer.

FIGURE 5 illustrates a sequence diagram showing the flow of data packets
through different overlapping regions of an exemplary integrated transport decoder buffer
configured for the maximum outer boundary range.

DETAILED DESCRIPTION 

FIGURES 1 through 5, discussed below, and the various embodiments used to 
describe the principles of the present invention in this patent document are by way of 
illustration only and should not be construed in any way to limit the scope of the invention. 
Those skilled in the art will understand that the principles of the present invention may be 
implemented in any suitably arranged streaming video receiver. 

Additionally, those skilled in the art will readily understand that while the
embodiment of the present invention described below is principally oriented towards



wo 00/30356 PCT/EP99/08927 


streaming video, this is by way of illustration only. In fact, the improved integrated transport 
decoder buffer described below may be readily adapted for use in connection with streaming 
audio data or other streaming data that must be supplied to a decoder at a required rate. 

FIGURE 1 illustrates an end-to-end transmission of streaming video from
streaming video transmitter 110 through data network 120 to streaming video receiver 130,
according to one embodiment of the present invention. Depending on the application,
streaming video transmitter 110 may be any one of a wide variety of sources of video frames,
including a data network server, a television station, a cable network, a desktop personal
computer (PC), or the like. Streaming video transmitter 110 comprises video frame
source 112, video encoder 114 and encoder buffer 116. Video frame source 112 may be any
device capable of generating a sequence of uncompressed video frames, including a television
antenna and receiver unit, a video cassette player, a video camera, a disk storage device
capable of storing a "raw" video clip, and the like.

The uncompressed video frames enter video encoder 114 at a given picture rate
(or "streaming rate") and are compressed according to any known compression algorithm or
device, such as an MPEG-4 encoder. Video encoder 114 then transmits the compressed video
frames to encoder buffer 116 for buffering in preparation for transmission across data
network 120. Data network 120 may be any suitable IP network and may include portions of
both public data networks, such as the Internet, and private data networks, such as an
enterprise-owned local area network (LAN) or wide area network (WAN).

Streaming video receiver 130 comprises decoder buffer 131, video decoder 134
and video display 136. Decoder buffer 131 receives and stores streaming compressed video
frames from data network 120. Decoder buffer 131 then transmits the compressed video
frames to video decoder 134 as required. Video decoder 134 decompresses the video frames at
the same rate (ideally) at which the video frames were compressed by video encoder 114.

Decoder buffer 131 further comprises integrated transport decoder (ITD)
buffer 132, ITD buffer monitor 138 and re-transmission controller 139. In accordance with the
principles of the present invention, ITD buffer 132 integrates both temporal and data-unit
occupancy considerations in order to provide video decoder 134 with compressed video
frames at a rate that is sufficient to avoid underflow conditions, during which video decoder
134 is starved for compressed video frames.

ITD buffer 132 accomplishes this in cooperation with ITD buffer monitor 138
and re-transmission controller 139. ITD buffer monitor 138 monitors the level of
data-occupancy in ITD buffer 132 and detects missing data packets and potential underflow
conditions. In response to notification from ITD buffer monitor 138, re-transmission
controller 139 requests re-transmission of data missing from ITD buffer 132 in order to avoid
underflow conditions. In an advantageous embodiment of the present invention, ITD
buffer 132, ITD buffer monitor 138, and re-transmission controller 139 are implemented in a 
personal computer (PC) that receives streaming video and/or audio from, for example, the 
Internet over a high-speed data line. In such an embodiment, ITD buffer 132 may be 
implemented in main random access memory (RAM) of the PC or in RAM on a video card, 
and ITD buffer monitor 138 and re-transmission controller 139 may be implemented in the 
CPU of the PC. To implement ITD buffer 132 in a PC environment, ITD buffer 132 may be 
embodied as computer executable instructions stored as a program on storage media 140, such 
as a CD-ROM, computer diskette, or similar device, that may be loaded into removable disk 
port 141 in streaming video receiver 130. 

Continuous decoding of compressed video frames is a key requirement of a 
real-time multimedia application, such as streaming video. To meet this requirement, a 
decoder-encoder buffer model is normally used to ensure that underflow and overflow events 
do not occur. These constraints limit the size (bit-wise) of video pictures that enter the encoder 
buffer. The constraints are usually expressed in terms of encoder-buffer bounds, which when 
adhered to by the encoder, guarantee continuous decoding and presentation of the compressed 
video stream at the receiver. 

FIGURE 2 shows an ideal encoder-decoder model of a video coding system.
Under this ideal model, uncompressed video frames 201-203 enter the compression engine of
encoder 214 at a given picture-rate, X frames/second, as indicated by the Time(1) line. The
compressed frames exit encoder 214 and enter encoder buffer 216 at the same X
frames/second, as indicated by the Time(2) line. Similarly, the compressed frames exit
encoder buffer 216 and enter channel 220 at X frames/second. Channel 220 is a generic
representation of any transmission medium, such as the Internet, that transfers compressed
video frames from a transmitting source to a receiver. In the ideal case, the delay of
channel 220 ($\delta_c$) is a constant value.

Next, the compressed frames exit channel 220 and enter decoder buffer 232 at 
the same X frames/second as at the input and the output of encoder 214, as indicated by the 
Time(3) line. Decoder buffer 232 transmits the compressed frames to decoder 234, which 
decompresses the frames and outputs decompressed frames 251-253 at the original X 
frames/second at which frames entered encoder 214. 




Ideally, the end-to-end buffering delay (i.e., the total delay encountered in both
encoder buffer 216 and decoder buffer 232) is constant. However, the same piece of
compressed video data (e.g., a particular byte of the video stream) encounters different delays
in encoder buffer 216 and decoder buffer 232. In the ideal model, encoding in encoder 214 and
decoding in decoder 234 are instantaneous and require zero execution time, and data packets
are not lost.

The encoder buffer bounds can be expressed using discrete-time summation. In
discrete-time domain analysis, $\Delta$ is the end-to-end delay (i.e., including both encoder
buffer 216 and decoder buffer 232 and channel delay $\delta_c$) in units of time. For a given video
coding system, $\Delta$ is a constant number applicable to all frames entering the encoder-decoder
buffer model.

To simplify the discrete-time analysis, it is assumed that the end-to-end
buffering delay ($\Delta T = \Delta - \delta_c$) is an integer-multiple of the frame duration (T). Therefore,
$\Delta N = (\Delta - \delta_c)/T$ represents the delay of the encoder and decoder buffers in terms of the number
of video frames (N). For the purposes of clarity and brevity in describing the principles of the
present invention, the remainder of this disclosure will use time units specified in
frame-duration intervals. For example, using the encoder time reference shown in FIGURE 2, the n-th
frame enters encoder buffer 216 at time index "n". The decoder time-reference of
decoder buffer 232 is shifted by the channel delay ($\delta_c$) with respect to encoder buffer 216.
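By way of illustration only, the conversion from time units to frame-duration units may be sketched as follows. The function name and the numeric values are hypothetical and are not taken from this disclosure; the sketch simply evaluates the buffering delay (end-to-end delay minus channel delay) divided by the frame duration T and rejects values that violate the integer-multiple assumption stated above.

```python
def buffering_delay_in_frames(delta, delta_c, frame_period, tol=1e-9):
    """Return the encoder+decoder buffering delay expressed in frames.

    delta        -- end-to-end delay (seconds), including the channel delay
    delta_c      -- channel delay (seconds)
    frame_period -- frame duration T (seconds)
    """
    frames = (delta - delta_c) / frame_period
    # The text assumes the buffering delay is an integer multiple of T.
    if abs(frames - round(frames)) > tol:
        raise ValueError("buffering delay is not an integer multiple of T")
    return round(frames)


# Hypothetical example: 0.5 s end-to-end, 0.1 s channel delay, 25 frames/second.
delta_n = buffering_delay_in_frames(0.5, 0.1, 0.04)
```

With these assumed values the buffering delay spans ten frame intervals.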

The data rate (r) at the output of encoder (e) 214 during frame-interval "i" may
be represented as $r^{e}(i)$. Here, "data rate" is used generically. It could signify bit rate, byte rate,
or even packet rate. Similarly, the data rate at the input of decoder buffer 232 may be
represented as $r^{d}(i)$. Based on the ideal model, $r^{e}(iT) = r^{d}(iT + \delta_c)$. In addition, based on the
convention established above, $r^{e}(i) = r^{d}(i)$. Thus, the bounds of encoder buffer 216 can be
expressed as:

$$\max\left[\sum_{j=n+1}^{n+\Delta N} r^{e}(j) - B^{d}_{max},\ 0\right] \le B^{e}(n) \le \min\left[\sum_{j=n+1}^{n+\Delta N} r^{e}(j),\ B^{e}_{max}\right]$$

Equation 1

where $B^{d}_{max}$ and $B^{e}_{max}$ are the maximum decoder and encoder buffer sizes respectively.
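By way of illustration only, the encoder buffer bounds may be evaluated numerically as sketched below. The sketch assumes that Equation 1 takes the conventional form in which the data sent over the next ΔN frame intervals, less the maximum decoder buffer size, gives the lower bound (floored at zero), and the same sum capped at the maximum encoder buffer size gives the upper bound. The function name, argument names, and sample values are hypothetical.

```python
def encoder_buffer_bounds(rates, n, delta_n, b_dec_max, b_enc_max):
    """Return (lower, upper) bounds on encoder buffer occupancy at time index n.

    rates[j]  -- amount of data output during frame interval j (bits, bytes or packets)
    delta_n   -- buffering delay in frame intervals
    b_dec_max -- maximum decoder buffer size (same unit as rates)
    b_enc_max -- maximum encoder buffer size (same unit as rates)
    """
    # Data transmitted during intervals n+1 .. n+delta_n.
    window = sum(rates[n + 1 : n + delta_n + 1])
    lower = max(window - b_dec_max, 0)
    upper = min(window, b_enc_max)
    return lower, upper


# Hypothetical per-interval data amounts.
bounds = encoder_buffer_bounds([10, 20, 30, 40, 50], n=0, delta_n=3,
                               b_dec_max=60, b_enc_max=80)
```

An encoder that keeps its buffer occupancy between the two returned values would, under the stated assumption, avoid both decoder underflow and decoder overflow in the ideal model.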

In the ideal case, it is also assumed that encoder 214 starts transmitting data
immediately after the first frame enters encoder 214. Therefore, the start-up delay $dd_f$ (i.e., the
delay time the first piece of data from the first picture spends in decoder buffer 232 prior to
decoding) equals the end-to-end, encoder-decoder buffer delay: $dd_f = \Delta T = T \cdot \Delta N$.

In one embodiment of the present invention, ITD buffer 132 minimizes
underflow events by taking into consideration the above-described problems of the ideal
buffer model and the ideal encoder-decoder buffer constraints. ITD buffer 132 is based on lost
packet recovery using re-transmission.

FIGURE 3 is a simplified block diagram of exemplary end-to-end transmission
of streaming video, without support for re-transmission. For the purposes of simplicity and
clarity, streaming video transmitter 110 has been replaced by compressed video source 305
and data network 120 has been replaced by channel 320. Compressed video source 305
transmits data packets at rate $r^{e}(n)$ and channel 320 transmits data packets at rate $r^{d}(n)$. Since
video re-transmission is not supported for this embodiment, ITD buffer monitor 138 and
re-transmission controller 139 are omitted from the diagram. Streaming video receiver 130 has
been simplified and is represented by ITD buffer 132 and video decoder 134.
As noted above, ITD buffer 132 integrates temporal and data-unit occupancy
models. ITD buffer 132 is divided into temporal segments of T seconds each. By way of
example, the parameter T may be the frame period in a video sequence. The data packets (bits,
bytes, or packets) associated with a given duration T are buffered in the corresponding
temporal segment. All of the data packets associated with a temporal unit are referred to as an
"access" unit. By way of example, data packets 351, 352, and 353 comprise access unit $A_{n+1}$,
data packet 354 comprises access unit $A_{n+2}$, and data packets 355 and 356 comprise access unit
$A_{n+3}$.

During time interval n, the n-th access unit, $A_n$, is being decoded by decoder 134
and access unit $A_{n+1}$ is stored at the temporal segment nearest to the output of ITD buffer 132.
An access unit may be an audio frame, a video frame, or even a portion of a video frame, such
as a Group of Blocks (GOB). Therefore, the duration required to decode or display an access
unit is the same as the duration of the temporal segment T. During the time-interval n, the rate
at which data enters ITD buffer 132 is $r^{d}(n)$. The number of data packets in each access unit
is not required to be the same. Compression algorithms used in video encoder 114 may
compress the data packets in successive access units by different amounts, even though each
access unit represents temporal units of the same duration.

For example, the three data packets 351-353 in access unit $A_{n+1}$ may comprise a
complete video frame, Frame 1. The single data packet 354 in $A_{n+2}$ may represent only those
portions of Frame 2 that are different than Frame 1. Nonetheless, data packet 354 is sufficient
to create Frame 2 if the Frame 1 data is already known. Since Frame 1 and Frame 2 have the
same duration, the temporal segment, T, is the same for $A_{n+1}$ and $A_{n+2}$.

Each temporal segment holds a maximum number of packets, $K_{max}$, with each
packet having a maximum size, $b_{max}$ (in bits or bytes). Therefore, the maximum size of an
access unit, $S_{max}$, may be represented by $S_{max} = K_{max}(b_{max})$. Video encoder 114 is assumed to
begin each access-unit with a new packet that is present only in that access unit.
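By way of illustration only, a temporal segment and its capacity limits may be modelled as sketched below. The class and attribute names are hypothetical; the only relationships taken from the text are that a segment holds at most a maximum number of packets, that each packet has a maximum size, and that the maximum access-unit size is the product of the two.

```python
from dataclasses import dataclass, field


@dataclass
class TemporalSegment:
    """One temporal segment of duration T holding the packets of one access unit."""
    k_max: int                                   # maximum number of packets per segment
    b_max: int                                   # maximum packet size (bits or bytes)
    packets: list = field(default_factory=list)  # sizes of the packets stored so far

    @property
    def s_max(self):
        # Maximum access-unit size: maximum packet count times maximum packet size.
        return self.k_max * self.b_max

    @property
    def size(self):
        return sum(self.packets)

    def add_packet(self, packet_size):
        if packet_size > self.b_max:
            raise ValueError("packet exceeds the maximum packet size")
        if len(self.packets) >= self.k_max:
            raise ValueError("segment already holds the maximum number of packets")
        self.packets.append(packet_size)


# Hypothetical limits: at most 3 packets of at most 1500 bytes each.
segment = TemporalSegment(k_max=3, b_max=1500)
segment.add_packet(1200)
segment.add_packet(800)
```

Access units of different sizes occupy segments of the same duration, which is why the sketch tracks packet sizes per segment rather than a fixed byte count.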

The amount of data in ITD buffer 132 at time index n may be described
in terms of $B^{a}(n)$ and $B^{b}(n)$. $B^{a}(n)$ represents the number of consecutive-and-complete access
units in ITD buffer 132 at the beginning of interval n, and $B^{b}(n)$ represents the total
consecutive amount of data in ITD buffer 132 at the end of interval n. For $B^{a}(n)$, temporal
segments containing partial data are not counted, and all segments following a partial segment
are also not counted even if they contain a complete access unit's worth of data. Hence, $T \cdot B^{a}(n)$
represents how much video, in temporal units (e.g., seconds), ITD buffer 132 holds at
time index n (without running into an underflow if no more data arrives).

Therefore, if $S_n$ denotes the size of access unit n, the relationship between $B^{a}$
and $B^{b}$ can be expressed as Equation 2 below:

$$B^{b}(n) = \sum_{j=n+1}^{n+B^{a}(n)} S_j + U_{B^{a}(n)+1}$$

Equation 2

where $S_j$ is the maximum size of the access unit for temporal segment j and $U_{B^{a}(n)+1}$ is the
partial (incomplete) data of access unit $A_{n+B^{a}(n)+1}$, which is stored in temporal segment $B^{a}(n)+1$
at the beginning of time index n.
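By way of illustration only, the two occupancy measures described above may be computed as sketched below. The sketch counts complete access units from the buffer output until the first incomplete segment is met, and then totals the sizes of those complete units plus the partial data in the first incomplete segment. It assumes, hypothetically, that the size of each access unit is known to the receiver, and it ignores the distinction drawn in the text between the beginning and the end of interval n. All names and values are illustrative.

```python
def buffer_occupancy(stored, unit_sizes):
    """Return (complete_units, consecutive_data) for an ITD-style buffer.

    stored[k]     -- data currently held in the (k+1)-th segment from the output
    unit_sizes[k] -- full size of the access unit assigned to that segment
    """
    complete_units = 0
    for have, need in zip(stored, unit_sizes):
        if have < need:
            break  # a partial segment ends the consecutive-and-complete run
        complete_units += 1

    consecutive_data = sum(unit_sizes[:complete_units])
    if complete_units < len(stored):
        # Partial data of the first incomplete access unit.
        consecutive_data += stored[complete_units]
    return complete_units, consecutive_data


# Hypothetical contents: the third segment is incomplete, the fourth is complete
# but is not counted because it follows a partial segment.
occupancy = buffer_occupancy(stored=[300, 100, 50, 200],
                             unit_sizes=[300, 100, 120, 200])
```

Multiplying the first returned value by T gives the playback time the buffer can sustain without further arrivals.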

When re-transmission is supported as an embodiment, ITD buffer 132 requires
capability for a) outputting one temporal segment (T) worth of data at the beginning of every
temporal time-interval n; b) detecting lost packet(s) and transmitting associated negative
acknowledge (NACK) messages to the transmitter 110 or 305; c) continuously storing newly
arrived primary (i.e., not re-transmitted) packets; and d) storing re-transmitted packets. The
ideal ITD buffer 132 maintains the data rate of the video stream, without delays caused by
re-transmission of any lost data. In other words, if $r^{e}(n)$ is the transmission data rate used by an
idealized video encoder 114 under lossless circumstances, ideal ITD buffer 132 will maintain
this data rate without degradation caused by the re-transmission process. Depending upon the
number of re-transmission requests, encoder buffer 116 may adjust its output data rate $r^{e}(n)$,
with a corresponding adjustment by ITD buffer 132.
In one embodiment, decoder buffer 131 adds buffering for the incoming video
stream in order to compensate for the time required for detection and recovery of lost data and
for the delay associated with a "real" world implementation. By delaying all incoming video
streams by this compensation time, decoder buffer 131 outputs video stream data at a
continuous rate as required for decoding. Re-transmission controller 139 and ITD buffer 132
incorporate processes for minimizing the time for detecting the absence of packets and
transferring NACKs for re-transmission by streaming video transmitter 110. The minimum
duration of time needed for detecting a predetermined number of lost packets is represented by
$T_L$. In general, $T_L$ is a function of the delay jitter caused by data arriving later than expected by
ITD buffer 132.

The minimum amount of time needed for streaming video receiver 130 to
recover a packet after being declared lost is represented by $T_R$. Time $T_R$ includes the time
required for streaming video receiver 130 to send a NACK to streaming video transmitter 110
and the time needed for the re-transmitted data to reach streaming video receiver 130
(assuming that the NACK and re-transmitted data are not lost).

Exemplary decoder buffer 131 transfers a re-transmitted packet with a
minimum delay ($T_L + T_R$) for the lost packet interval. If the minimum delay experienced by any
video data for an ideal decoder buffer 131 is represented by $dd_{min}$, the amount of delay $\Delta_R$ that
may be added to the minimum ideal delay in order to account for the total delay for
re-transmission is:

$$\Delta_R \ge u(T_L + T_R - dd_{min})$$

Equation 3

where u(x) = x for x > 0, and u(x) = 0 for x ≤ 0.

Decoder buffer 131 adds delay $\Delta_R$ buffering for all output data to video decoder
134 in order to provide time for decoding and transferring of the data, resulting in continuous
video streams. Therefore, the total encoder buffer 116 to decoder buffer 132 output delay
($\Delta_{TOT}$) may be represented by:

$$\Delta_{TOT} = \Delta_{ideal} + \Delta_R \ge \Delta_{ideal} + u(T_L + T_R - dd_{min})$$

Equation 4
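By way of illustration only, the added re-transmission delay and the resulting total delay may be computed as sketched below. The function names are hypothetical, times are expressed in integer milliseconds purely for convenience, and the sketch takes the smallest value permitted by the lower bound, namely the positive part of the loss-detection time plus the recovery time minus the minimum ideal decoding delay.

```python
def u(x):
    """Positive-part function: x when x > 0, otherwise 0."""
    return x if x > 0 else 0


def retransmission_delay(t_l, t_r, dd_min):
    """Smallest extra delay that leaves time to detect and recover a lost packet."""
    return u(t_l + t_r - dd_min)


def total_delay(delta_ideal, t_l, t_r, dd_min):
    """Ideal end-to-end buffering delay plus the minimum re-transmission delay."""
    return delta_ideal + retransmission_delay(t_l, t_r, dd_min)


# Hypothetical values in milliseconds.
extra = retransmission_delay(t_l=80, t_r=200, dd_min=100)
total = total_delay(delta_ideal=400, t_l=80, t_r=200, dd_min=100)
```

When the minimum ideal decoding delay already exceeds the detection-plus-recovery time, the positive-part function returns zero and no extra delay is needed.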

ITD buffer 132 provides buffering (storage) for a minimum number of temporal
segments as compensation for re-transmission time requirements and as prevention for
an underflow event. The ITD buffer 132 sizing may be based, for example, on minimum and
maximum boundaries for storing temporal segments. The process for determining these
boundaries is described in the following paragraphs.

In the absence of lost packets and delay jitter, at any time index n, the ITD
buffer 132 provides the following occupancy capability:



Equation 5 



An ideal ITD buffer 132 has a maximum decoding delay ($dd_{max}$), where
$dd_{max} \le \Delta_{ideal}$. Consequently, in the absence of lost packets and delay jitter, ideal ITD buffer
132 satisfies the following requirement:



Equation 6 



Further, in the absence of lost data and delay jitter, the ideal ITD buffer 132 provides storage
requirements for $T \cdot B^{a}(n)$ data, bounded as follows:



Equation 7 



ITD buffer 132 storage capability with consideration for delay jitter may be
expressed as:

$$\ldots \le T \cdot B^{a}(n) \le dd_{max} + u(T_L + T_R - dd_{min}) + T_E$$

Equation 8

where $T_E$ is the delay jitter associated with packets arriving earlier than expected at ITD buffer
132. Therefore, if $B^{a}_{max}$ is the maximum number of temporal segments that ITD buffer 132
holds, then:

$$T \cdot B^{a}_{max} \ge dd_{max} + u(T_L + T_R - dd_{min}) + T_E$$

or

$$B^{a}_{max} \ge \left\lceil \frac{dd_{max} + u(T_L + T_R - dd_{min}) + T_E}{T} \right\rceil$$

Equation 9
ITD buffer 132 storage capability is based on the above equations, minimum
ideal storage requirements, and delays associated with data transfers. ITD buffer 132 has a
minimum size determined by ideal encoder buffer 116, which is represented by $B^{d}_{max}$. ITD
buffer 132 provides added storage to adjust for delays introduced by ITD buffer 132 and for
data arriving earlier than expected. ITD buffer 132 storage requirements (in temporal units) for
accommodation of these exemplary delays are represented by $T_{extra}$, as shown below.

$$T_{extra} = u(T_L + T_R - dd_{min}) + T_E$$

Equation 10

Using this relationship, ITD buffer 132 storage requirement for satisfying the
$B^{b}_{max}$ upper limit (in temporal units) is shown by the following upper boundary relationship:

$$B^{b}_{max} \ge B^{d}_{max} + R_{max} \cdot T_{extra} = B^{d}_{max} + R_{max}\,[u(T_L + T_R - dd_{min}) + T_E]$$

Equation 11

An ideal ITD buffer 132 has a minimum decoding delay ($dd_{min}$) which is equal
to zero and a maximum decoding delay ($dd_{max}$) which is equal to the ideal end-to-end
buffering delay ($\Delta_{ideal}$). The ideal ITD buffer 132 is sized to provide extra minimum delay that
is equal to $T_L + T_R$, where $T_L$ and $T_R$ are assumed to be integer-multiples of the duration T.
The minimum time delay requirement is found by substituting the ideal buffer region $dd_{min} = 0$
and $dd_{max} = \Delta_{ideal}$ into the previously described equation for $\Delta_{TOT}$. This extra buffer requirement
stores $N_L + N_R$ temporal segments, where $N_R = T_R/T$ and $N_L = T_L/T$. Thus, ideal ITD buffer 132 is
found to provide storage for the following number of temporal segments:

$$B^{a}_{max} \ge N_L + N_R + \left\lceil \frac{T_E + dd_{max}}{T} \right\rceil$$

Equation 12

Since the maximum decoding delay, $dd_{max} = \Delta_{ideal} = \Delta T$, corresponds to $\Delta N$
temporal segments, $B^{a}_{max}$ is further described as follows:

$$B^{a}_{max} \ge N_L + N_R + \Delta N + N_E$$

Equation 13

where $N_E = \lceil T_E/T \rceil$.
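By way of illustration only, the number of temporal segments needed by an ideal ITD buffer with a zero minimum decoding delay may be estimated as sketched below. The sketch assumes that the required count is the sum of the loss-detection segments, the recovery segments, the ideal buffering delay in segments, and the early-arrival jitter rounded up to whole segments, consistent with the region layout of FIGURE 4. The function name and the millisecond values are hypothetical.

```python
def min_segments(t_l, t_r, t_e, delta_ideal, t):
    """Return the minimum number of temporal segments for an ideal ITD buffer.

    All arguments are integer durations in the same unit (e.g. milliseconds):
    t_l detection time, t_r recovery time, t_e early-arrival jitter,
    delta_ideal ideal buffering delay, t segment duration.
    """
    if t_l % t or t_r % t or delta_ideal % t:
        # The text assumes these are integer multiples of the segment duration.
        raise ValueError("t_l, t_r and delta_ideal must be multiples of t")
    n_l = t_l // t
    n_r = t_r // t
    delta_n = delta_ideal // t
    n_e = -(-t_e // t)  # integer ceiling of t_e / t
    return n_l + n_r + delta_n + n_e


# Hypothetical values: 40 ms segments, 80 ms detection, 200 ms recovery,
# 50 ms early-arrival jitter, 400 ms ideal buffering delay.
segments = min_segments(t_l=80, t_r=200, t_e=50, delta_ideal=400, t=40)
```

Integer arithmetic is used for the ceiling so that the result does not depend on floating-point rounding.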

FIGURE 4 is a sequence diagram showing the flow of data packets through
different regions of exemplary ITD buffer 132 under the assumption that $dd_{min} = 0$ (the lower
boundary level) and $dd_{max} = \Delta_{ideal}$. ITD buffer 132 data enters from the right side of the diagram
and exits to the video decoder 134 at the left side. The most recently received data is in a
buffer area which is labeled "too-early for re-transmission request region" (too-early).
Depending on the location in the too-early region of the buffer, ITD buffer 132 introduces
buffer delays labeled $N_E$, $\Delta N$, or $N_L$. The area of this too-early buffer region which comprises
the ideal delay $\Delta N$ is labeled as the ideal-buffer region. ITD buffer 132 manages the
ideal-buffer region as an ideal video buffer, i.e., data packets flow through this region and are only
delayed by the inherent characteristics of the buffer element(s). Ideal ITD buffer 132 provides
the remaining too-early buffer areas to compensate for delays associated with the transfer of
video streams from streaming video transmitter 110 to decoder buffer 131 ($N_E$), as well as delays
caused by delayed or lost video packets ($N_L$).

ITD buffer 132 provides delay $N_R$ in the re-transmission region in order to
compensate for expected time requirements for the initiation and reception of re-transmission
requests. Exemplary decoder buffer 131 initiates re-transmission requests during the time
periods associated with the re-transmission region.

It is important to note that the ideal-buffer and re-transmission regions may
overlap, depending on the values of the different delay parameters ($dd_{min}$, $T_R$, $T_L$). However,
for the exemplary ideal ITD buffer 132 with $dd_{min} = 0$, the re-transmission and ideal-buffer
regions do not overlap.

For ITD buffer 132, $N_E$ represents the initial decoding delay ($dd_f$), which
corresponds to the amount of delay encountered by the very first piece of data that enters the
buffer prior to the decoding of the first picture (or access unit). This $dd_f$ is based on, among
other things, the streaming video transmitter 110 and data network 120 data transmission rates
during elapsed time $dd_f$. In the ideal case, ITD buffer 132 uses this same data rate for entering
received data into its buffer (storage) regions. Ideal decoder buffer 131 recognizes the amount
of data in its ITD buffer 132 regions just prior to the time that the first access unit is decoded
as the "start-up-delay" data, which is determined from the
following relationship:
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^"^ Equation 14 





When ddmin=0, ideal decoder buffer 131 re-transmission processing
comprises the following procedures:

1. The ideal-buffer region is filled until all data associated with the start-up delay are in the
buffer. Since loss events may also occur during this time interval, these data may be treated in
a special way, such as by using reliable transmission (e.g., using TCP) for them. The ideal
condition for lossless data is satisfied when:



Equation 15 



where B_k is the amount of data stored in ideal ITD buffer 132 temporal segment k at any
instant of time.

2. After Equation 15 is satisfied, ITD buffer 132 advances the content of
all temporal storage segments by one segment toward the buffer output. Subsequently, ideal
ITD buffer 132 repeats this process every T units of time. After Nl+Nr periods of T (i.e., after
Tl+Tr), decoder 134 starts decoding the first access unit. The time period that starts when
decoding of the first access unit begins is labeled T1. Hence, the beginning of any time period
n (Tn) represents the time when access unit An+k is moved to temporal segment k.

3. Ideal ITD buffer 132 considers data missing in temporal segment Nr of the re-transmission
buffer region as lost. This condition occurs when:





$$B_{N_r}(n) < S_j, \qquad j = n + N_r$$    Equation 16



where B_Nr(n) is the amount of data in temporal segment Nr at time period n and Sj is the size
of access unit j. When ideal ITD buffer 132 determines that data is missing, it sends a
re-transmission request to streaming video transmitter 110.

4. Ideal ITD buffer 132 places arriving re-transmitted data into their
corresponding temporal segments of the re-transmission region. Assuming the re-transmitted
data are received, ideal ITD buffer 132 transfers the re-transmitted data to the video decoder
134 prior to the decoding times of their corresponding access units.
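
The four procedures above can be illustrated with a short sketch. It is not the implementation of ITD buffer 132; it is a toy model under stated assumptions: the class name `ITDBufferSketch`, its methods, and the byte counts are hypothetical, each temporal segment is reduced to a byte counter, segment k holds access unit n+k during time period n, and data are treated as missing when segment Nr holds fewer bytes than the size of its access unit.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ITDBufferSketch:
    """Toy model of the temporal segments for the ddmin = 0 case (hypothetical)."""
    n_l: int          # number of segments that tolerate late packets (Nl)
    n_r: int          # number of segments in the re-transmission region (Nr)
    sizes: dict       # assumed known size S_j, in bytes, of each access unit j
    period: int = 0   # current time period n
    requests: list = field(default_factory=list)  # units asked to be re-sent

    def __post_init__(self):
        # segments[k - 1] counts the bytes held for access unit n + k.
        self.segments = deque([0] * (self.n_l + self.n_r))

    def receive(self, unit, nbytes):
        """Place arriving or re-transmitted bytes in the unit's segment."""
        k = unit - self.period
        if 1 <= k <= len(self.segments):
            self.segments[k - 1] += nbytes
            return True
        return False  # outside the modelled buffer window

    def advance(self):
        """Shift every segment one step toward the output (once per T)."""
        delivered = self.segments.popleft()  # segment 1 goes to the decoder
        self.segments.append(0)
        self.period += 1
        self._check_loss()
        return self.period, delivered

    def _check_loss(self):
        # Segment Nr now holds access unit j = n + Nr; too few bytes means lost.
        j = self.period + self.n_r
        if j in self.sizes and self.segments[self.n_r - 1] < self.sizes[j]:
            self.requests.append(j)


if __name__ == "__main__":
    buf = ITDBufferSketch(n_l=2, n_r=2, sizes={j: 100 for j in range(1, 7)})
    for unit in (1, 2, 4):          # access unit 3 is lost in transit
        buf.receive(unit, 100)
    print(buf.advance(), buf.requests)
```

In this sketch a re-transmission is requested only once, when the access unit reaches segment Nr, which leaves the remaining Nr-1 periods for the re-transmitted data to arrive before decoding.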

FIGURE 5 is a sequence diagram showing the flow of data packets through
different regions of exemplary ITD buffer 132 with overlap between the ideal buffer, Nl, and
re-transmission regions. For this case, ITD buffer 132 is configured for the maximum outer
boundary where ddmin ≥ Tl+Tr, causing its ideal-buffer region to totally overlap its
re-transmission region. Thus, decoder buffer 131 transfers the received video stream to video
decoder 134 after all of the data associated with the start-up delay arrives. Then, video decoder
134 decodes the first access unit without further delays. Decoder buffer 131 performs the
re-transmission function as previously described.

In a similar manner, decoder buffer 131 provides data transfer between
streaming video transmitter 110 and video decoder 134 for the general case when ddmin has a
value between the minimum and maximum boundary areas (i.e., when 0<ddmin<Tl+Tr), with
an additional delay of (Tl+Tr-ddmin).
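
The three cases (ddmin=0, the general case, and the maximum boundary) can be summarized in one small function. This is a minimal sketch: the function name `additional_delay` and its parameter names are hypothetical, and returning zero at and beyond the maximum boundary is an assumption based on the statement that the first access unit is then decoded without further delays.

```python
def additional_delay(dd_min, t_l, t_r):
    """Return the extra delay (Tl + Tr - ddmin) added before decoding starts.

    Assumption: once ddmin reaches Tl + Tr, no additional delay is needed,
    so the result is clamped at zero.
    """
    if dd_min < 0 or t_l < 0 or t_r < 0:
        raise ValueError("delays must be non-negative")
    return max(0, t_l + t_r - dd_min)


if __name__ == "__main__":
    # Units are arbitrary (e.g., multiples of the segment period T).
    for dd_min in (0, 2, 5):
        print(dd_min, additional_delay(dd_min, t_l=3, t_r=2))
```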

Although the present invention has been described in detail, those skilled in the
art should understand that they can make various changes, substitutions and alterations herein
without departing from the spirit and scope of the invention in its broadest form.



WO 00/30356 PCT/EP99/08927

CLAIMS: 



1. For use with a video decoder (134) capable of decoding streaming video, a
decoder buffer (132) capable of receiving from a streaming video transmitter data packets
(351) comprising said streaming video and storing said data packets (351) in a plurality of
access units, each of said access units capable of holding at least one data packet associated
with a selected frame in said streaming video, wherein said decoder buffer (132) comprises:

a first buffer region comprising at least one access unit capable of storing data
packets (351) that are less immediately needed by said video decoder (134); and

a re-transmission region comprising at least one access unit capable of storing
data packets (351) that are most immediately needed by said video decoder (134), wherein
said decoder buffer (132), in response to a detection of a missing data packet in said
re-transmission region, requests that said streaming video transmitter retransmit said missing
packet.

2. The decoder buffer (132) set forth in Claim 1 wherein at least one of said data
packets (351) is stored in said first buffer region for a period of time equal to a start-up delay
time of said decoder buffer (132).

3. The decoder buffer (132) set forth in Claim 1 wherein said data packets (351) 
are first stored in said first buffer region and are shifted into said re-transmission region. 

4. The decoder buffer (132) set forth in Claim 1 wherein said first buffer region is 
separate from said re-transmission region. 

5. The decoder buffer (132) set forth in Claim 1 wherein said first buffer region
overlaps at least a portion of said re-transmission region.

6. The decoder buffer (132) set forth in Claim 5 wherein said first buffer region
overlaps all of said re-transmission region.

7. The decoder buffer (132) set forth in Claim 1 wherein said first buffer region is
separated from said re-transmission region by a second buffer region in which a late data
packet is late with respect to an expected time of arrival of said late data packet, but is not
sufficiently late to require a re-transmission of said late data packet.

8. A receiver capable of receiving encoded streaming data comprising:

a device (136) capable of at least one of: 1) displaying streaming video data
associated with said encoded streaming data and 2) audibly playing streaming audio data
associated with said encoded streaming data;

a decoder (134) capable of decoding said encoded streaming data; and

a decoder buffer (132) capable of receiving from a streaming data transmitter
data packets (351) comprising said encoded streaming data and storing said data packets (351)
in a plurality of access units, each of said access units capable of holding at least one data
packet associated with a selected portion of said encoded streaming data, wherein said decoder
buffer (132) comprises:

a first buffer region comprising at least one access unit capable of storing data
packets (351) that are less immediately needed by said decoder (134); and

a re-transmission region comprising at least one access unit capable of storing
data packets (351) that are most immediately needed by said decoder (134), wherein said
decoder buffer (132), in response to a detection of a missing data packet in said
re-transmission region, requests that said streaming video transmitter retransmit said missing
packet.

9. The receiver set forth in Claim 8 wherein at least one of said data packets (351)
is stored in said first buffer region for a period of time equal to a start-up delay time of said
decoder buffer (132).

10. The receiver set forth in Claim 8 wherein said data packets (351) are first stored 
in said first buffer region and are shifted into said re-transmission region. 

11. The receiver set forth in Claim 8 wherein said first buffer region is separate
from said re-transmission region.

12. The receiver set forth in Claim 8 wherein said first buffer region overlaps at
least a portion of said re-transmission region.

13. The receiver set forth in Claim 12 wherein said first buffer region overlaps all
of said re-transmission region.

14. The receiver set forth in Claim 8 wherein said first buffer region is separated
from said re-transmission region by a second buffer region in which a late data packet is late
with respect to an expected time of arrival of said late data packet, but is not sufficiently late to
require a re-transmission of said late data packet.

15. For use with a video decoder (134) capable of decoding streaming video, a
method of buffering the streaming video comprising the steps of:

receiving from a streaming video transmitter data packets (351) comprising the
streaming video and storing the data packets (351) in a plurality of access units in a decoder
buffer (132), each of the access units capable of holding at least one data packet associated
with a selected frame in the streaming video;

storing data packets (351) that are less immediately needed by the video
decoder (134) in a first buffer region of the decoder buffer (132) comprising at least one
access unit capable of storing data packets (351); and

storing data packets (351) that are most immediately needed by the video
decoder (134) in a re-transmission region of the decoder buffer (132) comprising at least one
access unit, wherein the decoder buffer (132), in response to a detection of a missing data
packet in the re-transmission region, requests that the streaming video transmitter retransmit
the missing packet.

16. The decoder buffer (132) set forth in Claim 15 wherein at least one of the data
packets (351) is stored in the first buffer region for a period of time equal to a start-up delay
time of the decoder buffer (132).

17. The decoder buffer (132) set forth in Claim 15 wherein the data packets (351)
are first stored in the first buffer region and are shifted into the re-transmission region.

18. The decoder buffer (132) set forth in Claim 15 wherein the first buffer region is
separate from the re-transmission region.

19. The decoder buffer (132) set forth in Claim 15 wherein the first buffer region
overlaps at least a portion of the re-transmission region.

20. The decoder buffer (132) set forth in Claim 19 wherein the first buffer region 
overlaps all of the re-transmission region. 

21. The decoder buffer (132) set forth in Claim 15 wherein the first buffer region is
separated from the re-transmission region by a second buffer region in which a late data packet
is late with respect to an expected time of arrival of the late data packet, but is not sufficiently
late to require a re-transmission of the late data packet.